ABSTRACT

This paper concerns the distribution of the sum of the
k largest observations in a sample of m observations from
a gamma distribution with n degrees of freedom. If n is
an integer, the density and cdf of the distribution are given
as a linear function of gamma density functions. If n is not
an integer, an approximate distribution of the same form is
obtained. The distribution of the sum arises in a problem of
selecting variables in a multiple regression analysis.


Key words: Gamma Distribution; Laplace Transform; Linear
Regression.
1. Main result. Let $X_{(r)}$ denote the r-th smallest observation
in a sample of size m from a gamma distribution with
n degrees of freedom, and let

$$Y_k = \sum_{r=m-k+1}^{m} X_{(r)}$$

denote the sum of the k largest observations in the sample. First we
obtain the Laplace transform of the distribution of $Y_k$.
By inverting the transform we derive the density and cdf
of the distribution. If n is a positive integer then the
density and the cdf of $Y_k$ will be given as a linear function
of gamma density functions.

Let $g_n(x)$ and $G_n(x)$ denote the density and cdf, respectively,
of the gamma distribution with n degrees of freedom.
The Laplace transform of the distribution is given by

$$\int_0^\infty e^{-\theta x}\, dG_n(x) = (1+\theta)^{-n}, \qquad \theta > 0.$$

If Y is distributed according to the gamma distribution, the
Laplace transform of the conditional distribution of Y, given
$Y \ge x$, is given by

$$\phi_x(\theta) = (1-G_n(x))^{-1} \int_x^\infty e^{-\theta y}\, dG_n(y)
= (1+\theta)^{-n}\, (1-G_n((1+\theta)x))\, (1-G_n(x))^{-1}.$$

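Let $L_k(\theta)$ denote the Laplace transform of $Y_k$, and let
$H(x)$ denote the cdf of $X_{(m-k)}$. Given $X_{(m-k)} = x$, $Y_k$ is distributed
as the sum of k independent observations from the conditional
distribution of Y, given $Y \ge x$. Therefore

$$
\begin{aligned}
L_k(\theta) &= \int_0^\infty \phi_x^k(\theta)\, dH(x) \\
&= m\binom{m-1}{k} \int_0^\infty \phi_x^k(\theta)\, G_n^{m-k-1}(x)\, (1-G_n(x))^k\, dG_n(x) \\
&= m\binom{m-1}{k} (1+\theta)^{-nk} \int_0^\infty (1-G_n((1+\theta)x))^k\, G_n^{m-k-1}(x)\, dG_n(x), \qquad 1 \le k < m, \\
L_m(\theta) &= (1+\theta)^{-nm}, \qquad k = m.
\end{aligned}
\tag{1.1}
$$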


Let n be a positive integer. Integrating by parts we have

$$1 - G_n(x) = g_1(x) + g_2(x) + \dots + g_n(x) = e^{-x} \sum_{a=0}^{n-1} \frac{x^a}{a!}. \tag{1.2}$$

Let $c_{uv}$ denote the coefficient of $x^u$ in the expansion of

$$\Big( \sum_{a=0}^{n-1} \frac{x^a}{a!} \Big)^{v}$$

for nonnegative integer values of u and v. The numbers $c_{uv}$
can be computed recursively from the following formula.

$$
\begin{aligned}
c_{uv} &= \frac{v^u}{u!}, && u \le n-1, \\
c_{u1} &= 0, && u \ge n, \\
c_{uv} &= 0, && u > (n-1)v, \\
c_{uv} &= \sum_{a=0}^{n-1} \frac{1}{a!}\, c_{u-a,\,v-1}, && n \le u \le (n-1)v, \quad v > 1.
\end{aligned}
$$
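The recursion for the coefficients can be checked numerically. The following is a minimal sketch, assuming exact rational arithmetic; the function names `c_recursive` and `c_direct` are illustrative and do not come from the paper. The first function follows the four cases of the recursion, and the second expands the power of the truncated exponential series directly, so the two can be compared.

```python
from fractions import Fraction
from math import factorial


def c_recursive(n, vmax):
    """Table c[v][u], 0 <= v <= vmax, 0 <= u <= (n-1)*vmax, from the recursion."""
    umax = (n - 1) * vmax
    # v = 0: the empty product equals 1, so only the constant term is non-zero.
    c = [[Fraction(int(u == 0)) for u in range(umax + 1)]]
    for v in range(1, vmax + 1):
        row = []
        for u in range(umax + 1):
            if u <= n - 1:
                # Low-order terms agree with those of exp(v*x).
                row.append(Fraction(v ** u, factorial(u)))
            elif u > (n - 1) * v:
                # Beyond the degree of the polynomial.
                row.append(Fraction(0))
            else:
                # n <= u <= (n-1)*v: convolve the previous row with 1/a!.
                row.append(sum(c[v - 1][u - a] / factorial(a) for a in range(n)))
        c.append(row)
    return c


def c_direct(n, v):
    """Coefficients of (sum_{a<n} x^a/a!)^v by repeated polynomial multiplication."""
    base = [Fraction(1, factorial(a)) for a in range(n)]
    poly = [Fraction(1)]
    for _ in range(v):
        out = [Fraction(0)] * (len(poly) + n - 1)
        for i, p in enumerate(poly):
            for a, b in enumerate(base):
                out[i + a] += p * b
        poly = out
    return poly
```

For example, with n = 3 and v = 2 both routines give the coefficients of $(1 + x + x^2/2)^2 = 1 + 2x + 2x^2 + x^3 + x^4/4$.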


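From (1.1), using (1.2), we have after simplification

$$
L_k(\theta) = \frac{m\binom{m-1}{k}}{\Gamma(n)} \sum_{r=0}^{m-k-1} \sum_{u=0}^{(n-1)k} \sum_{v=0}^{(n-1)r} (-1)^r \binom{m-k-1}{r}\, c_{uk}\, c_{vr}\, (k+1+r)^{-(u+v+n)}\, \Gamma(u+v+n)\, (1+\theta)^{-nk+u}\, (1+\alpha_r\theta)^{-u-v-n}
\tag{1.3}
$$

for $1 \le k < m$, where $\alpha_r = k/(1+r+k)$.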
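As a plausibility check on the expansion (1.3), the triple sum can be evaluated directly and compared with a simulated value of $E(e^{-\theta Y_k})$. The sketch below assumes integer n and floating-point arithmetic; the function names and the simulation settings (number of replications, seed) are illustrative choices, not part of the paper.

```python
import math
import random


def c_row(n, v):
    """Coefficients c_{uv}, u = 0..(n-1)v, of (sum_{a<n} x^a/a!)^v as floats."""
    base = [1.0 / math.factorial(a) for a in range(n)]
    poly = [1.0]
    for _ in range(v):
        out = [0.0] * (len(poly) + n - 1)
        for i, p in enumerate(poly):
            for a, b in enumerate(base):
                out[i + a] += p * b
        poly = out
    return poly


def laplace_yk(theta, n, m, k):
    """Evaluate the triple sum (1.3) for integer n and 1 <= k <= m."""
    if k == m:
        return (1.0 + theta) ** (-n * m)
    ck = c_row(n, k)
    total = 0.0
    for r in range(m - k):
        cr = c_row(n, r)
        alpha = k / (1 + r + k)
        for u in range((n - 1) * k + 1):
            for v in range((n - 1) * r + 1):
                s = u + v + n
                total += ((-1) ** r * math.comb(m - k - 1, r) * ck[u] * cr[v]
                          * float(k + 1 + r) ** (-s) * math.gamma(s)
                          * (1.0 + theta) ** (-(n * k - u))
                          * (1.0 + alpha * theta) ** (-s))
    return m * math.comb(m - 1, k) * total / math.gamma(n)


def laplace_mc(theta, n, m, k, reps=20000, seed=0):
    """Monte Carlo estimate of E exp(-theta * Y_k) from simulated gamma samples."""
    rng = random.Random(seed)
    acc = 0.0
    for _ in range(reps):
        xs = sorted(rng.gammavariate(n, 1.0) for _ in range(m))
        acc += math.exp(-theta * sum(xs[m - k:]))
    return acc / reps
```

At $\theta = 0$ the sum should return 1, and for n = 1, m = 2, k = 1 it should reduce to $(1+\theta)^{-1}(1+\theta/2)^{-1}$, the transform of the larger of two unit exponential variables. The alternating signs can cause cancellation in floating point, so this sketch is only intended for small m.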

Let $W = U + \alpha_r V$, where U and V are random variables
independently distributed according to the gamma distribution
with $nk - u$ and $n + u + v$ degrees of freedom, respectively.

Using (1.2) we see that H(x) is given as a linear function
of the gamma density functions.


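The Laplace transform of the distribution of W is equal
to $(1+\theta)^{-nk+u}(1+\alpha_r\theta)^{-n-u-v}$. Therefore, from (1.3) we obtain by
inversion the cdf of $Y_k$, given by

$$
F_k(x) = \frac{m\binom{m-1}{k}}{\Gamma(n)} \sum_{r=0}^{m-k-1} \sum_{u=0}^{(n-1)k} \sum_{v=0}^{(n-1)r} (-1)^r \binom{m-k-1}{r}\, \Gamma(u+v+n)\, (k+1+r)^{-(u+v+n)}\, c_{uk}\, c_{vr}\, H_{uv}(x), \qquad 1 \le k < m,
\tag{1.5}
$$

where $H_{uv}(x)$ denotes the cdf of W for the given u, v and r.
For k = m we have $F_m(x) = G_{nm}(x)$. Thus $F_k(x)$ is given as a
linear function of the gamma density functions. By differentiation
we obtain the density function of $Y_k$ of the same form.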
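From (1.3) we obtain the $\ell$-th moment of $Y_k$, given by

$$
E(Y_k^{\ell}) = \frac{m\binom{m-1}{k}}{\Gamma(n)} \sum_{r=0}^{m-k-1} \sum_{u=0}^{(n-1)k} \sum_{v=0}^{(n-1)r} (-1)^r \binom{m-k-1}{r}\, \Gamma(u+v+n)\, (k+1+r)^{-(u+v+n)}\, c_{uk}\, c_{vr}\, E(W^{\ell}),
$$

where

$$
E(W^{\ell}) = \sum_{t=0}^{\ell} \binom{\ell}{t}\, \alpha_r^{t}\, \frac{\Gamma(nk-u+\ell-t)}{\Gamma(nk-u)}\, \frac{\Gamma(n+u+v+t)}{\Gamma(n+u+v)}.
$$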


For $n = 1$ and $\ell = 1$, the above formula checks with the known
result (see e.g. David (1970), 2.7.3)

$$E(Y_k) = \sum_{i=m-k+1}^{m} \sum_{j=1}^{i} (m-j+1)^{-1}. \tag{1.7}$$
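The agreement between the moment formula and (1.7) can be verified in exact arithmetic. The sketch below assumes integer n, so that every gamma function reduces to a factorial or a rising factorial; the names `moment`, `rising` and `mean_exponential_case` are illustrative and not taken from the paper.

```python
from fractions import Fraction
from math import comb, factorial


def c_table(n, vmax):
    """c[v][u] = coefficient of x^u in (sum_{a<n} x^a/a!)^v, for v = 0..vmax."""
    base = [Fraction(1, factorial(a)) for a in range(n)]
    table = [[Fraction(1)]]
    for _ in range(vmax):
        prev = table[-1]
        nxt = [Fraction(0)] * (len(prev) + n - 1)
        for i, p in enumerate(prev):
            for a, b in enumerate(base):
                nxt[i + a] += p * b
        table.append(nxt)
    return table


def rising(a, s):
    """Gamma(a + s) / Gamma(a) for a positive integer a and integer s >= 0."""
    out = 1
    for i in range(s):
        out *= a + i
    return out


def moment(ell, n, m, k):
    """ell-th moment of Y_k for integer n, using the expansion derived from (1.3)."""
    if k == m:
        return Fraction(rising(n * m, ell))  # Y_m is gamma with nm degrees of freedom
    c = c_table(n, max(k, m - k - 1))
    total = Fraction(0)
    for r in range(m - k):
        alpha = Fraction(k, k + 1 + r)
        for u in range((n - 1) * k + 1):
            for v in range((n - 1) * r + 1):
                # E(W^ell) with W = U + alpha_r V.
                ew = sum(comb(ell, t) * alpha ** t
                         * rising(n * k - u, ell - t) * rising(n + u + v, t)
                         for t in range(ell + 1))
                weight = ((-1) ** r * comb(m - k - 1, r) * factorial(u + v + n - 1)
                          * c[k][u] * c[r][v] / Fraction(k + 1 + r) ** (u + v + n))
                total += weight * ew
    return total * m * comb(m - 1, k) / factorial(n - 1)


def mean_exponential_case(m, k):
    """Right side of (1.7): mean of Y_k when n = 1."""
    return sum(Fraction(1, m - j + 1)
               for i in range(m - k + 1, m + 1) for j in range(1, i + 1))
```

For m = 5 and k = 2, both `moment(1, 1, 5, 2)` and `mean_exponential_case(5, 2)` evaluate to 107/30. Setting `ell = 0` gives a normalization check, since the weights must then sum to 1.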
Now we consider the case in which n is not an integer.
Let $n = n^* + \nu$, where $n^*$ denotes the integral part of n and
$0 < \nu < 1$. For any positive integer $t > n^*$, let

$$
A_t(x) = \sum_{r=n^*+1}^{t} g_{r+\nu}(x) - \sum_{r=1}^{t+1} g_r(x) + 1 = G_n(x) - G_{t+\nu}(x) + G_{t+1}(x).
\tag{1.8}
$$

We show that $A_t(x)$ is a probability distribution function for
sufficiently large values of t. The derivative of $A_t(x)$ with
respect to x is given by

$$
A_t'(x) = x^{n-1} e^{-x} \left[ \frac{1}{\Gamma(n)} - \frac{x^{t-n^*}}{\Gamma(t+\nu)} + \frac{x^{t-n+1}}{\Gamma(t+1)} \right].
\tag{1.9}
$$


Let $P_t(x)$ denote the quantity inside the square bracket on the
right side of (1.9). The derivative of $P_t(x)$ with respect to
x changes sign from negative to positive as x varies from 0
to $\infty$. Hence $P_t(x)$ is minimized for $x = x_0$, say, given by

$$x_0^{1-\nu} = \frac{(t-n^*)\, \Gamma(t+1)}{(t-n+1)\, \Gamma(t+\nu)}$$


- t^  '^e'^'^  for  large  t. 


We have

$$P_t(x_0) = \frac{1}{\Gamma(n)} - \frac{(1-\nu)\, x_0^{t-n^*}}{(t-n+1)\, \Gamma(t+\nu)}$$

r i e**  t " 7 > 0 for  large  t. 

r.(n)  /TiT 

Therefore, there exists a value of $t = t_0$, say, depending
on n such that $P_t(x) > 0$ for all x and $t \ge t_0$. Since
$A_t(0) = 0$ and $A_t(\infty) = 1$, it follows that $A_t(x)$ is a probability
distribution function on $[0,\infty)$ for $t \ge t_0$.

Since $G_{t+\nu}(x) - G_{t+1}(x) \to 0$ uniformly in x, as $t \to \infty$, it is
seen from (1.8) that $G_n(x) - A_t(x) \to 0$ uniformly in x. Also,
$A_t(x) \le G_n(x)$, since $G_{t+\nu}(x) \ge G_{t+1}(x)$. We have shown the
following result.

Theorem 1.1. Let $t > n$ be a positive integer. Then
$A_t(x) \le G_n(x)$ for all $x \ge 0$, and $A_t(x) \to G_n(x)$ uniformly in x,
as $t \to \infty$. There exists a value $t_0$ of t, depending on n, such
that $A_t(x)$ is a probability distribution function on $(0,\infty)$
for $t \ge t_0$.

The above theorem shows that when n is not an integer
we can approximate $G_n(x)$ by the distribution function $A_t(x)$,
given as a linear function of gamma density functions $g_r(x)$
where r is integer valued. Therefore, when n is not an
integer we approximate the distribution of $Y_k$ by the
distribution of the sum of the k largest order statistics from a
sample from the distribution $A_t(x)$. The Laplace transform
of the distribution of the sum is obtained by substituting
$A_t(x)$ for $G_n(x)$ in (1.1) and is given by

(1.11)  L*(0)  ” Ip  (I^^|gr((l^e)x) 

- I i, 

r-n  ♦!  r-n  ♦! 

- l'**  g,(x))"'‘''*  g„(x)  dx  . 

r-1  ^ ” 

The right side of (1.11) is seen after simplification to be
of the same form as (1.3). Inverting the transform we obtain
the distribution function in the same form as (1.5).

If a few parameters of the distribution of $Y_k$ are required,
such as the quantiles, as in the application considered below,
where n is not an integer, an alternative method is to
interpolate from the corresponding values given for adjacent integer
values of n.

Table I below shows the 90% and 95% upper points of the
distribution of $Y_k$ for certain values of k, m and n. The figures
given  in  the  table  for  n ■ I5  were  obtained  by  the  Monte-Carlo 
method.  The  table  is  not  comprehensive.  It  is  given  only  for 
illustration. 
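A Monte Carlo estimate of such upper points can be sketched as follows. This is an illustration under stated assumptions, not the procedure used to produce Table I: the function name `upper_points`, the number of replications, the seed and the empirical-quantile rule are all hypothetical choices. The sample is drawn from the gamma distribution with unit scale, so n need not be an integer.

```python
import random


def upper_points(n, m, k, probs=(0.90, 0.95), reps=20000, seed=1):
    """Empirical upper points and mean of Y_k, the sum of the k largest of m
    gamma observations with n degrees of freedom (unit scale)."""
    rng = random.Random(seed)
    sums = []
    for _ in range(reps):
        xs = sorted(rng.gammavariate(n, 1.0) for _ in range(m))
        sums.append(sum(xs[m - k:]))          # sum of the k largest values
    sums.sort()
    # Simple empirical quantile: the order statistic at position p * reps.
    points = {p: sums[min(reps - 1, int(p * reps))] for p in probs}
    mean = sum(sums) / reps
    return points, mean
```

A call such as `upper_points(0.5, 10, 3)` returns the estimated 90% and 95% points together with the sample mean. For n = 1 the mean can be compared with (1.7) as a sanity check on the simulation.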
2. Application. The problem of selecting a subset of independent
or predictor variables in regression analysis has
long been of interest to applied statisticians, and because
of the current availability of high-speed computing facilities,
this problem has received added attention in the recent statistical
literature. Recently, Hocking (1976) has published an
expository paper on the subject wherein he has described
various aspects of the problem. The paper includes an extensive
list of references to important publications in
the area.

The following situation arises in a problem of selecting
a subset from a given set of predictor variables in multiple
regression analysis. There are given m predictor variables
$X_1, \dots, X_m$ and a dependent variable Y. The predictor variables
and the dependent variable are jointly distributed according
to a multivariate normal distribution. It is required
to select a subset of k variables from the set of predictor
variables which has "most" prediction value. We call it the
best subset. There are $\binom{m}{k}$ subsets to choose from. Suppose
that the $\binom{m}{k}$ multiple correlations between Y and each subset
of the predictor variables are computed from a sample of M
observations. Let $R_k$ denote the largest among them. It is
a common practice to select the subset associated with $R_k$ as
the best subset. The distribution of $R_k$, which is required for
a test of significance for example, is mathematically intractable.
Theorem 2.1 below shows that if $Y, X_1, \dots, X_m$ are
independently distributed then $(M-1)R_k^2$ is asymptotically
distributed, for large M, as the sum of the k largest observations
in a sample of m observations from a chi-squared distribution
with one degree of freedom ($\chi_1^2$). The distribution of the sum
is given by the results of the preceding section.

Let $Y$ and $X_i$ denote the vectors of the deviations of
the observed values of Y and $X_i$ from their respective mean
values in the sample. Consider a subset of predictor variables,
say, $X_1, \dots, X_k$. Let $X = (X_1, \dots, X_k)$. The square
of the sample multiple correlation coefficient between Y and
$(X_1, \dots, X_k)$ is given by

$$R^2 = (Y'X(X'X)^{-1}X'Y)/(Y'Y).$$


Let $(Y, X_1, \dots, X_m) \sim N(\mu, \Sigma)$. Without loss of generality
we can assume that $\mu = 0$ and that $\Sigma$ is a correlation
matrix. Furthermore, suppose that $\Sigma = I$, the identity
matrix, that is, the variables are independently distributed.
Then, by the law of large numbers

$$(M-1)^{-1}(X'X) \xrightarrow{P} I \quad \text{as } M \to \infty.$$


Therefore, asymptotically for large M

$$(M-1)R^2 \overset{d}{\approx} \sum_{i=1}^{k} (Y'X_i)^2/(Y'Y) = \sum_{i=1}^{k} V_i, \tag{2.1}$$

where

$$V_i = (Y'X_i)^2/(Y'Y).$$


Now $V_1, \dots, V_m$ are random variables independently
and identically distributed as $\chi_1^2$. From (2.1) it follows that
$(M-1)R_k^2$ asymptotically is distributed as the sum of the k largest
order statistics in a sample of m observations from the $\chi_1^2$
distribution. This result is stated in the following theorem.


Theorem 2.1. If $Y, X_1, \dots, X_m$ are normally and independently
distributed then $(M-1)R_k^2$ is asymptotically distributed
as the sum of the k largest order statistics in a sample of
size m from the $\chi_1^2$ distribution.
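The limiting statistic on the right side of (2.1) can be simulated directly. The sketch below is an illustration under assumptions: it draws independent standard normal samples, forms $V_i = (Y'X_i)^2/(Y'Y)$ from the deviations about the sample means, and compares the mean of the sum of the k largest $V_i$ with the same quantity for independent $\chi_1^2$ draws. It does not compute $R_k^2$ itself, which would require fitting every subset; the function names, sample sizes and seed are hypothetical.

```python
import random


def limiting_statistic(M, m, k, rng):
    """Sum of the k largest V_i = (Y'X_i)^2 / (Y'Y) for one simulated sample."""
    y = [rng.gauss(0.0, 1.0) for _ in range(M)]
    y_mean = sum(y) / M
    y = [a - y_mean for a in y]               # deviations of Y from its mean
    yy = sum(a * a for a in y)
    vs = []
    for _ in range(m):
        x = [rng.gauss(0.0, 1.0) for _ in range(M)]
        x_mean = sum(x) / M
        yx = sum(a * (b - x_mean) for a, b in zip(y, x))
        vs.append(yx * yx / yy)
    return sum(sorted(vs)[m - k:])


def chi2_reference(m, k, rng):
    """Sum of the k largest of m independent chi-squared(1) draws."""
    zs = sorted(rng.gauss(0.0, 1.0) ** 2 for _ in range(m))
    return sum(zs[m - k:])


def compare_means(M=30, m=4, k=2, reps=4000, seed=3):
    """Return (mean of simulated statistic, mean of chi-squared reference)."""
    rng = random.Random(seed)
    stat = sum(limiting_statistic(M, m, k, rng) for _ in range(reps)) / reps
    ref = sum(chi2_reference(m, k, rng) for _ in range(reps)) / reps
    return stat, ref
```

The two means returned by `compare_means()` should be close. With k = m both should be near m, the mean of a chi-squared variable with m degrees of freedom.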

